How I Built This Blog (Part 2)

Choosing a Blogging Platform

Since Python is the language I'm most comfortable in, I want a blogging platform that is based on Python in case I ever need to customize something. My first thought was to build the blog in Django, following one of the numerous tutorials on the subject, since I'm very interested in building a Django site in the future. Django would make it possible to run a dynamic site with a live database and to build dashboards or other interactive visualizations that update over time. On the other hand, the benefits of a dynamic site are purchased with considerably increased complexity. Since I am much more interested in writing a blog than in writing blogging software, and the benefits of a dynamic site do not appear immediately useful, I settled on doing something much simpler, which meant using a static site.
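To make the static-site idea concrete, here is a minimal sketch of what a static generator does at its core: render every post to a plain HTML file once, at build time, so that serving the site needs no database or application server. The `PAGE` template, the `build_site` function, and the sample posts are all hypothetical names made up for illustration; real generators such as Pelican and Nikola layer themes, plugins, and Markdown rendering on top of this basic loop.

```python
import tempfile
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; real generators use full theme systems.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def build_site(posts, out_dir):
    """Render every post to its own HTML file once, at build time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, post in posts.items():
        html = PAGE.substitute(
            title=escape(post["title"]), body=escape(post["body"])
        )
        target = out_dir / f"{slug}.html"
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


# Hypothetical content, standing in for Markdown files on disk.
posts = {
    "hello": {"title": "Hello", "body": "First post."},
    "part-2": {"title": "Part 2", "body": "Static sites are simple."},
}

site_dir = tempfile.mkdtemp()
pages = build_site(posts, site_dir)
print(sorted(p.name for p in pages))
```

The output directory can then be copied to any plain file host, which is where the simplicity comes from.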

Having decided on a static site, I found that the choice of Python static blog generators basically comes down to Pelican vs Nikola (no, not that Nikola). Both platforms support rich sets of features including plugins, themes, and extensions. Both have all the features we identified earlier in our requirements:

| Feature | Nikola | Pelican |
| --- | --- | --- |
| Easy static sites? | Y | Y |
| Comments? | Y | Y |
| Custom themes? | Y | Y |
| LaTeX support? | Y | Y |
| Markdown? | Y | Y |
| Jupyter notebooks? | Y | Y |
| Publish with git? | Y | Y |

In point of fact, I've only discovered two small differences between the two packages.

First, Nikola offers a neat incremental build system that only rebuilds existing content if the source for that post has changed, whereas Pelican appears to rebuild every post every time. In theory the incremental builds could be a real time-saver if one had a large number of posts.
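The general idea behind an incremental build can be sketched in a few lines. This is not Nikola's actual implementation, just an illustration under simple assumptions: each post is a single `.md` file, a JSON cache file records a content hash per post, and "rendering" is a stand-in that wraps the raw text in `<pre>` tags. The names `file_digest` and `incremental_build` are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path):
    """Return a SHA-256 hex digest of the file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def incremental_build(src_dir, out_dir, cache_file):
    """Rebuild only posts whose source changed since the last run.

    Returns the list of source file names that were rebuilt.
    """
    src_dir, out_dir, cache_file = Path(src_dir), Path(out_dir), Path(cache_file)
    out_dir.mkdir(parents=True, exist_ok=True)
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    rebuilt = []
    for src in sorted(src_dir.glob("*.md")):
        digest = file_digest(src)
        if cache.get(src.name) == digest:
            continue  # source unchanged: keep the existing output
        # Stand-in for real rendering: wrap the raw text in <pre> tags.
        (out_dir / f"{src.stem}.html").write_text(f"<pre>{src.read_text()}</pre>")
        cache[src.name] = digest
        rebuilt.append(src.name)
    cache_file.write_text(json.dumps(cache))
    return rebuilt


# Demonstration in a throwaway directory.
root = Path(tempfile.mkdtemp())
src, out, cache = root / "posts", root / "output", root / "cache.json"
src.mkdir()
(src / "a.md").write_text("first post")
(src / "b.md").write_text("second post")

first = incremental_build(src, out, cache)   # both posts are new
second = incremental_build(src, out, cache)  # nothing changed
(src / "b.md").write_text("second post, edited")
third = incremental_build(src, out, cache)   # only b.md changed
print(first, second, third)
```

With only a handful of posts the saving is negligible, but the skipped work grows with the size of the archive, which is the time-saver described above.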

The second, and potentially more significant, difference between the two platforms is the user base. As of the time of this writing, Pelican has 10k stars on GitHub and Nikola has only 2k. To be sure, both projects have established user bases and ongoing development and support, but I have a strong preference for the larger user base. A larger user base means more people out there writing plugins and documentation and answering Stack Overflow questions, and that's what I want.

Update 12/20/2020: Although I initially chose Pelican because of its larger user base, I ended up changing my mind. Pelican does have plugins that support notebooks, but Pelican itself does not do so out of the box, and neither plugin I tried (pelican-jupyter and ipython) appears to be actively maintained. So I very quickly ended up trapped in the realm of hunting through documentation and trying to resolve version conflicts. That was the opposit

How I Built This Blog (Part 1)

I intend to use this blog to post some of my personal data science projects and to work out some of my ideas. The primary goal is to keep myself writing code and building a portfolio of data science projects, but hopefully others will find something useful to read here as well.

As my first series of posts, I'm going to write about the project of creating the blog itself.

Let's begin by defining our requirements and then we can look to see what technologies we might adopt to fulfill these requirements.

  • The first and most important requirement is that we want a Python-based blogging platform that is easy to use. We want to spend our time writing content, not twiddling with configuration files and so on. Some additional features that would be nice to have in our blogging platform would be
    • the ability for readers to leave comments or subscribe to future posts,
    • the ability to use custom themes for styling the appearance, and
    • the ability to support LaTeX-style mathematics, preferably with KaTeX.
  • Second, we want to be able to keep all of our source code and other content in version control to ensure we have good offsite backups and to give us flexibility to move to a different platform in the future if our requirements change. I use git for my version control, so I'd like to be able to publish content to the site via git.
  • Third, we'd really like to be able to write our content in Markdown files and post Jupyter Notebooks so that we can share code and visualizations as well as written text.
  • Fourth, we'd like to have our own domain name.
  • Fifth, we'd like to do all this as cheaply as possible ...

In the next installment, we'll use these sketchily defined requirements to go fishing for neat new tech to use to build our blog.

Test Post!

Basic Markdown Functions

Text

italics

bold

~strikethrough~

links

blockquotes

Lists:

Ordered
  1. One
  2. Two
Unordered
  • Cat
  • Bat
Checkboxes
  • [x] Checkboxes
  • [ ] Theorem environments

(For some reason, as of the time of writing the checkboxes also have bullets beside them, which is not desired.)

Unicode text

μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος

οὐλομένην, ἣ μυρί᾽ Ἀχαιοῖς ἄλγε᾽ ἔθηκε,

πολλὰς δ᾽ ἰφθίμους ψυχὰς Ἄϊδι προΐαψεν

ἡρώων, αὐτοὺς δὲ ἑλώρια τεῦχε κύνεσσιν

οἰωνοῖσί τε πᾶσι, Διὸς δ᾽ ἐτελείετο βουλή,

ἐξ οὗ δὴ τὰ πρῶτα διαστήτην ἐρίσαντε

Ἀτρεΐδης τε ἄναξ ἀνδρῶν καὶ δῖος Ἀχιλλεύς.

Images from the web

Mathematics

The flavor of Markdown I'm using supports $\LaTeX$ typesetting of mathematics via MathJax.

For instance, here are Maxwell's equations:

\begin{align}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\
\nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0
\end{align}

It doesn't seem to support theorem environments, alas.

Python code

In [1]:
# a comment

import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go 
from urllib.request import urlopen
import json

Data frames

In [2]:
df = px.data.iris()
df
Out[2]:
     sepal_length  sepal_width  petal_length  petal_width    species  species_id
  0           5.1          3.5           1.4          0.2     setosa           1
  1           4.9          3.0           1.4          0.2     setosa           1
  2           4.7          3.2           1.3          0.2     setosa           1
  3           4.6          3.1           1.5          0.2     setosa           1
  4           5.0          3.6           1.4          0.2     setosa           1
...           ...          ...           ...          ...        ...         ...
145           6.7          3.0           5.2          2.3  virginica           3
146           6.3          2.5           5.0          1.9  virginica           3
147           6.5          3.0           5.2          2.0  virginica           3
148           6.2          3.4           5.4          2.3  virginica           3
149           5.9          3.0           5.1          1.8  virginica           3

150 rows × 6 columns

Charts and Graphs

Line Charts

I'm using Plotly for my graphing because it offers an intuitive, easy-to-use API and supports interactive visualizations.

In [3]:
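# example graph using the Iris dataset: scatter plot with marginal plots and
# per-species OLS trendlines (trendline="ols" requires statsmodels to be installed)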
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species", marginal_y="violin",
           marginal_x="box", trendline="ols", template="simple_white")
fig.show()

Scatterplots

In [4]:
# the same dataset in 3d

fig = px.scatter_3d(df, x='sepal_length', y='sepal_width', z='petal_width',
              color='species')
fig.show()

Area charts

Another example visualizing topographical data with a color scheme.

In [6]:
# 3D surface example plot from the plotly docs
# Read data from a csv
z_data = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/api_docs/mt_bruno_elevation.csv')

fig = go.Figure(data=[go.Surface(z=z_data.values)])

fig.update_layout(title='Mt Bruno Elevation', autosize=False,
                  width=500, height=500,
                  margin=dict(l=65, r=50, b=65, t=90))

fig.show()

Geodata visualizations

Another cool choropleth example from the docs that uses GeoJSON to visualize the unemployment rate in the US by county.

In [7]:
with urlopen('https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json') as response:
    counties = json.load(response)


df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/fips-unemp-16.csv",
                   dtype={"fips": str})

fig = px.choropleth_mapbox(df, geojson=counties, locations='fips', color='unemp',
                           color_continuous_scale="Viridis",
                           range_color=(0, 12),
                           mapbox_style="carto-positron",
                           zoom=3, center = {"lat": 37.0902, "lon": -95.7129},
                           opacity=0.5,
                           labels={'unemp':'unemployment rate'}
                          )
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()